Abstract

We study productivity dispersions across workers, firms and industrial
sectors. Empirical study of the Japanese data shows that they
all obey the Pareto law, and also that the Pareto index decreases with 
the level of aggregation. In order to explain these two stylized facts, 
we propose a theoretical framework built upon the basic principle of 
statistical physics. In this framework, we employ the concept of super- 
statistics which accommodates fluctuations of aggregate demand. 

I. Introduction

The standard economic analysis postulates that the marginal products of a 
production factor such as labor are all equal across firms, industries, and 
sectors in equilibrium. Otherwise, there remains a profit opportunity, and 
this contradicts the notion of equilibrium. Factor endowment, together
with preferences and technology, determines equilibrium in such a way that
the marginal products are equal across sectors and firms.

A bold challenge to this orthodox theory was given by Keynes (1936). He
pointed out that the utilization of production factors is not full, and, therefore, 
that factor endowment is not an effective determinant of equilibrium. In the 
General Theory, Keynes identified the less than full utilization of production 
factors with the presence of involuntary unemployment of labor. Much
controversy revolved around the theoretically ambiguous notion of "involuntary"
unemployment. However, for Keynes' economics to make sense, the existence
of involuntary unemployment is not necessary. More generally, it is enough
to assume that there is underemployment in the economy in the sense that
the marginal products are not uniform at the highest level. In effect, what
Keynes said is that in the demand-constrained equilibrium, the productivity of
production factors differs across industries and firms.

There are, in fact, several empirical findings which strongly suggest that 
productivity dispersion exists in the economy. The celebrated Okun's Law is 
an example. The standard assumption on the neoclassical production function
entails that the elasticity of output with respect to labor input (or the
unemployment rate) is less than one. Okun (1962), however, found this
elasticity to be three for the U.S. economy. This finding turns out to be so robust
that it has eventually become known as Okun's Law.

Okun attempts to explain his own finding by resorting to several factors. 
They are (a) additional jobs for people who do not actively seek work in a 
slack labor market but nonetheless take jobs when they become available; (b) 
a longer workweek reflecting less part-time and more overtime employment; 
and (c) extra productivity. On cyclical changes in productivity, he argues as 
follows: 

"I now believe that an important part of the process involves 
a downgrading of labor in a slack economy — high-quality workers 
avoiding unemployment by accepting low-quality and less produc- 
tive jobs. The focus of this paper is on the upgrading of jobs as- 
sociated with a high-pressure economy. Shifts in the composition 
of output and employment toward sectors and industries of higher 
productivity boost aggregate productivity as unemployment de- 
clines. Thus the movement to full employment draws on a reserve 
army of the underemployed as well as of the unemployed. In the 
main empirical study of this paper, I shall report new evidence 
concerning the upgrading of workers into more productive jobs in 
a high-pressure economy. (Okun (1962, p. 208))" 

Okun well recognized and indeed, by way of his celebrated law, demonstrated 
that there exists dispersion of labor productivity in the economy. 

Another example is wage dispersion, an observation long made by labor 
economists. Mortensen (2003), for example, summarizes his analysis of wage
dispersion as follows.

"Why are similar workers paid differently? Why do some jobs 
pay more than others? I have argued that wage dispersion of this 
kind reflects differences in employer productivity. ... Of course, 
the assertion that wage dispersion is the consequence of produc- 
tivity dispersion begs another question. What is the explanation 
for productivity dispersion? (Mortensen (2003, p. 129))" 

To this question, Mortensen's explanation is as follows:

"Relative demand and productive efficiency of individual firms
are continually shocked by events. The shocks are the consequence 
of changes in tastes, changes in regulations, and changes induced 
by globalization among others. Another important source of per- 
sistent productivity differences across firms is the process of adopt- 
ing technical innovation. We know that the diffusion of new and 
more efficient methods is a slow, drawn-out affair. Experimen- 
tation is required to implement new methods. Many innovations 
are embodied in equipment and forms of human capital that are 
necessarily long-lived. Learning how and where to apply any new 
innovation takes time and may well be highly firm specific. Since
old technologies are not immediately replaced by the new for all 
of these reasons, productive efficiency varies considerably across 
firms at any point in time. (Mortensen (2003, p. 130))" 

It is important to recognize that the processes Mortensen describes are
intrinsically stochastic. Because there are millions of firms in the economy,
it is impossible for economists to track the precise behavior of
an individual firm, although no doubt each firm tries to maximize profits,
perhaps dynamically, under certain constraints. Therefore, we must explore
such stochastic dynamics in the economy as a whole by a different method. In
this paper, we pursue such a stochastic approach.

The "stochastic approach" was indeed once popular in diverse areas of 
study such as income distribution and firm (city) size. Primary examples are
Champernowne (1973) and Ijiri and Simon (1977). However, the stochastic
approach eventually lost its momentum. According to Sutton (1997), the
reason is as follows:



"It seems to have been widely felt that these models might fit 
well, but were "only stochastic." The aim was to move instead 
to a program of introducing stochastic elements into conventional 
maximizing models. (Sutton (1997, p. 45))" 

This trend is certainly in accordance with the motto of mainstream
macroeconomics. However, Sutton, acknowledging the importance of the
stochastic approach, argues that¹

"a proper understanding of the evolution of structure may re- 
quire an analysis not only of such economic mechanisms, but also 
of the role played by purely statistical (independence) effects, and 
that a complete theory will need to find an appropriate way of 
combining these two strands. (Sutton (1997, p. 57))" 

1 Incidentally, Sutton (1997), evidently having the central limit theorem in mind,
identifies what he calls "purely statistical" effects with the effects of independent stochastic
variables. However, the method of statistical physics is effective even when stochastic
variables are correlated, that is, when they are not independent.

Actually, as the system under investigation becomes larger and more
complicated, the importance of the stochastic approach increases. In fact, it is
almost the definition of a macro-system that the basic approach based on
statistical physics can be usefully applied to it. In natural sciences such as physics,
chemistry, biology and ecology, the "stochastic approach" is routinely taken
to analyze macro-systems. Why not in macroeconomics?

In what follows, we first explain the concept of stochastic macro-equilibrium.
Next, we present an empirical analysis of productivity dispersion across
Japanese firms. Then, we present a theory which can explain the observed
distribution of productivity. The theoretical analysis which we offer follows
the standard method of statistical physics, and abstracts from microeconomic
analysis of the behavior of individual agents. We believe that, as Sutton (1997)
suggests, our analysis complements the standard search theoretical approach
which pays careful attention to microeconomic behavior. Finally, we discuss
implications of our analysis and a possible direction of future research.



II. The Stochastic Macro-equilibrium 



Tobin (1972) proposed the concept of stochastic macro-equilibrium in his
attempt to explain the observed Phillips curve. He argues that

"[it is] stochastic, because random intersectoral shocks keep in- 
dividual labor markets in diverse states of disequilibrium; macro- 
equilibrium, because the perpetual flux of particular markets pro- 
duces fairly definite aggregate outcomes. (Tobin (1972, p. 9))" 

His argument remained only verbal. However, as we will see shortly, the
fundamental principle of statistical physics, in fact, provides the exact
foundations for the concept of macro-equilibrium. A case in point is productivity
dispersion in the economy.

We consider an economy in which there are K firms with their respective
productivities c_1, c_2, ..., c_K. Without loss of generality, we can assume

(1) c_1 < c_2 < \cdots < c_K.

Labor endowment is N. We distribute n_k workers to the k-th firm. Thus, the
following equality holds:

(2) \sum_{k=1}^{K} n_k = N.

The neoclassical theory takes it for granted that n_K is N while the n_k
(k \neq K) are all zero. Instead, in what follows, we seek the most probable
distribution of productivity across workers under suitable constraints.² The
number of possible realizations of a particular configuration {n_1, n_2, ..., n_K},
W_{\{n\}}, is

(3) W_{\{n\}} = \frac{N!}{\prod_{k=1}^{K} n_k!}.

2 The productivity c_k corresponds to an allowed energy level, and workers to distinguishable
particles in statistical physics.

Because the total number of possible configurations is K^N, the probability of
occurrence of such a configuration, P_{\{n\}}, on the assumption of equal
probabilities for all configurations, is given by

(4) P_{\{n\}} = \frac{1}{K^N} \frac{N!}{\prod_{k=1}^{K} n_k!}.

Following the fundamental principle of statistical physics, we postulate that
the configuration {n_1, n_2, ..., n_K} that maximizes P_{\{n\}} under suitable
constraints is realized in equilibrium. The idea is similar to the method of
maximum likelihood in statistics or econometrics.
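
The counting behind equations (3) and (4) can be checked by brute force for a small system. The following is a minimal sketch, assuming illustrative sizes N = 6 and K = 3 that are not taken from the paper; the function names are hypothetical. It enumerates every configuration, confirms that the probabilities in (4) sum to one, and reports the most probable configuration when no demand constraint is imposed.

```python
from math import factorial


def configuration_count(n):
    """W_{n} = N! / prod_k n_k!, as in equation (3)."""
    w = factorial(sum(n))
    for n_k in n:
        w //= factorial(n_k)  # each partial quotient is an integer
    return w


def configuration_probability(n):
    """P_{n} = W_{n} / K^N, as in equation (4)."""
    K, N = len(n), sum(n)
    return configuration_count(n) / K**N


def all_configurations(N, K):
    """Yield every tuple (n_1, ..., n_K) of non-negative integers summing to N."""
    if K == 1:
        yield (N,)
        return
    for first in range(N + 1):
        for rest in all_configurations(N - first, K - 1):
            yield (first,) + rest


N, K = 6, 3  # illustrative sizes only, not values from the paper
configs = list(all_configurations(N, K))
total_probability = sum(configuration_probability(n) for n in configs)
most_probable = max(configs, key=configuration_probability)
print(total_probability, most_probable)
```

Without the aggregate demand constraint the maximum sits at the even split of workers; the constraint introduced below is what tilts the distribution toward an exponential shape.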

It is extremely important to recognize that this analysis is consistent with 
the standard assumption in economics that economic agents maximize their 
objective functions. As each particle satisfies the Newtonian law of motion, 
and maximizes the Hamiltonian in physics, in the economy, to be sure, all 
the micro agents optimize. However, constraints facing these micro agents, 
and even their objective functions keep changing due to idiosyncratic shocks. 
The situation is exactly the same as in physics where we never know the 
initial conditions for all the particles. In the economy, the number of micro
agents is less than its counterpart in physics, which is typically 10^{23}, but
still there are 10^6 firms and 10^7 households! It is simply impossible and
meaningless to analyze the micro behavior of agents in great detail. For studying
a macro-system, we must take the behaviors of micro agents as stochastic even
though their behaviors are purposeful. It is the fundamental principle of
statistical physics that we observe the state of the macro-system which maximizes
the probability (4).

Toward our goal, we define the following quantity:

(5) S = \frac{\ln P_{\{n\}}}{N} + \ln K \simeq \frac{1}{N} \left( N \ln N - \sum_{k=1}^{K} n_k \ln n_k \right) = -\sum_{k=1}^{K} p_k \ln p_k,

where p_k is defined as

(6) p_k = \frac{n_k}{N}.

Here, we assume that N and the n_k's are large, and apply the Stirling formula:

(7) \ln m! \simeq m \ln m - m for m \gg 1.

The quantity S corresponds to the Boltzmann-Gibbs entropy. Note that the
maximization of S is equivalent to that of P_{\{n\}}.
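
How quickly the Stirling approximation in (5) becomes accurate can be seen numerically. The sketch below is an illustration under assumed worker shares (0.5, 0.3, 0.2), which are hypothetical; it compares the exact value of ln P/N + ln K, computed with the log-gamma function, against the entropy form on the right-hand side of (5).

```python
from math import lgamma, log


def exact_S(n):
    """ln(P_{n}) / N + ln K, with ln m! evaluated exactly as lgamma(m + 1)."""
    N, K = sum(n), len(n)
    ln_P = lgamma(N + 1) - sum(lgamma(n_k + 1) for n_k in n) - N * log(K)
    return ln_P / N + log(K)


def entropy(n):
    """Boltzmann-Gibbs form -sum_k p_k ln p_k with p_k = n_k / N."""
    N = sum(n)
    return -sum((n_k / N) * log(n_k / N) for n_k in n if n_k > 0)


shares = (0.5, 0.3, 0.2)  # hypothetical worker shares p_k
for N in (100, 10_000, 1_000_000):
    n = tuple(round(N * p) for p in shares)
    print(N, exact_S(n), entropy(n))
```

The gap between the two columns shrinks roughly like (ln N)/N, which is why the approximation is harmless for an economy with millions of workers.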

We maximize S under two constraints. One is the normalization condition

(8) \sum_{k=1}^{K} p_k = 1.

This is, of course, equivalent to the resource constraint, \sum_{k=1}^{K} n_k = N. The
other constraint requires that the total output (GDP) Y is equal to the
aggregate demand \tilde{D},

(9) N \sum_{k=1}^{K} c_k p_k = Y = \tilde{D}.

For convenience, we define aggregate demand relative to factor endowment,
D, as follows:

(10) D = \frac{\tilde{D}}{N}.

In what follows, we simply call D the aggregate demand. As we will shortly
see, the aggregate demand D determines the state of stochastic
macro-equilibrium.

We maximize the following Lagrangian form:

(11) S - \alpha \left( \sum_{k=1}^{K} p_k - 1 \right) - \beta \left( \sum_{k=1}^{K} c_k p_k - D \right).

Differentiating (11) with respect to p_k, we obtain

(12) \ln p_k + (1 + \alpha) + \beta c_k = 0.

This yields

(13) p_k = e^{-(1+\alpha)} e^{-\beta c_k}.

The normalization condition or the resource constraint (8) determines α so
that the distribution we seek is obtained as follows:

(14) p_k = \frac{1}{Z(\beta)} e^{-\beta c_k}.

Thus, the distribution which maximizes the probability (4) under the two
constraints (8) and (9) is exponential. We may call it the equilibrium
distribution. The exponent of the equilibrium distribution is the Lagrangian multiplier
β corresponding to the aggregate demand constraint in (11).

This exponential distribution is called the Maxwell-Boltzmann distribution.
Here, Z(β) is what is called the partition function in physics, and makes sure
that the p_k's sum to one:

(15) Z(\beta) = \sum_{k=1}^{K} e^{-\beta c_k}.

Figure 1: Productivity Distribution Predicted by the
Stochastic Macro-equilibrium Theory for Two
Different Values of the Aggregate Demand, D,
D_1 > D_2. Notes: The values of β are obtained from
equation (16) for D_1 = 20 and D_2 = 100.



The exponent β is equivalent to the inverse of temperature, 1/T, in physics.
Equations (9), (10) and (14) yield the following:

(16) D = \frac{1}{Z(\beta)} \sum_{k=1}^{K} c_k e^{-\beta c_k} = -\frac{d}{d\beta} \ln Z(\beta).

This equation relates the aggregate demand D to the exponent of the
distribution β by way of the partition function Z(β). Note that at this stage,
the distribution of productivity c_k is arbitrary. Once it is given, the partition
function Z(β) is defined by equation (15), and it, in turn, determines³ the
relation between D and β.
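
Equation (16) defines β only implicitly, so in practice it has to be solved numerically for a given set of productivity levels. The following is a minimal sketch, assuming a hypothetical grid of levels c_k = k with K = 2000 and a plain bisection search; neither the grid nor the solver is taken from the paper, and the function names are illustrative.

```python
from math import exp


def boltzmann_shares(c, beta):
    """Equations (14)-(15): p_k = exp(-beta * c_k) / Z(beta)."""
    c_min = min(c)  # shifting by c_min avoids underflow and cancels in p_k
    weights = [exp(-beta * (c_k - c_min)) for c_k in c]
    Z = sum(weights)
    return [w / Z for w in weights]


def mean_productivity(c, beta):
    """Right-hand side of equation (16): sum_k c_k p_k."""
    return sum(c_k * p_k for c_k, p_k in zip(c, boltzmann_shares(c, beta)))


def solve_beta(c, D, lo=1e-9, hi=50.0, tol=1e-12):
    """Bisection for beta in [lo, hi] such that mean_productivity(c, beta) = D.

    The mean falls monotonically as beta rises, so bisection is sufficient.
    """
    if not mean_productivity(c, hi) < D < mean_productivity(c, lo):
        raise ValueError("D is not reachable with beta in [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_productivity(c, mid) > D:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


delta_c, K = 1.0, 2000  # hypothetical grid of productivity levels
c = [k * delta_c for k in range(1, K + 1)]
for D in (20.0, 100.0):
    beta = solve_beta(c, D)
    print(D, beta, 1.0 / D)
```

On this grid the solved β is close to, but not exactly, 1/D; the two agree in the limit where the spacing of the levels is small relative to D.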

3 This is exactly the same as in the standard analysis in statistical physics where the
relation between the average energy (D) and the temperature (T = 1/β) is determined
once the energy levels (c_k) are given for the system under investigation.

Suppose that the distribution of productivity across firms is uniform, that
is, c_k = k Δ_c, where Δ_c is a constant. Then, from (15), we obtain

(17) Z(\beta) = \sum_{k=1}^{K} e^{-\beta k \Delta_c} = \frac{1 - e^{-\beta K \Delta_c}}{e^{\beta \Delta_c} - 1} \simeq \frac{1}{\beta \Delta_c}

under the assumption that β Δ_c ≪ 1 and β K Δ_c ≫ 1. Therefore, in this case,
we know from equations (16) and (17) that the exponent of the equilibrium
exponential distribution, β, is equal to the inverse of aggregate demand:

(18) \beta = \frac{1}{D}.

Because the inverse of β is temperature, in this case, aggregate demand is
equivalent to temperature.
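
The step from the partition function to β = 1/D can be spelled out as follows. This is a short worked derivation using only the approximation Z(β) ≃ 1/(β Δ_c) for the uniform case and the general relation D = -d ln Z/dβ:

```latex
\ln Z(\beta) \simeq -\ln \beta - \ln \Delta_c
\quad\Longrightarrow\quad
D = -\frac{d}{d\beta} \ln Z(\beta) \simeq \frac{1}{\beta}
\quad\Longrightarrow\quad
\beta \simeq \frac{1}{D}.
```

The spacing Δ_c drops out because it enters ln Z only as an additive constant.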

In summary, under the assumption that the productivity dispersion across
firms is uniform, the equilibrium distribution of productivity across workers
becomes exponential, namely the Maxwell-Boltzmann distribution with the
exponent equal to the inverse of the aggregate demand D (Yoshikawa (2002),
and Aoki and Yoshikawa (2007)):

(19) p_k \simeq \frac{\Delta_c}{D} e^{-c_k / D}.

When the aggregate demand D is high, the distribution becomes flatter,
meaning that production factors are mobilized to firms or sectors with high
productivity, and vice versa. Figure 1 shows two distributions of productivity
corresponding to high aggregate demand D_1 and low D_2.

The present analysis provides a solid foundation for Okun's argument that 
in general, production factors are under-utilized, and that workers upgrade 
into more productive jobs in a "high-pressure" economy. So far, so good.
However, the Boltzmann-Gibbs distribution is precisely exponential. Does it 
really fit our empirical observation? We next turn to this question. 



III. Stylized Facts 

In this section, we empirically explore how productivity is actually distributed.
Before we proceed, we must point out that what we observe is the average
productivity c, defined by c = Y/L, where Y is the output, and L the labour
input. What matters theoretically is, of course, the unobserved marginal
productivity c_M defined by

(20) c_M = \frac{\partial Y}{\partial L}.

We will shortly discuss the relationship between the distributions of these two
different productivities. There, we will see that the difference between c and
c_M does not affect our results on the empirical distributions.

Another problem is that we measure L in terms of the number of workers,
as such data are readily available. One might argue that theoretically, it is
more desirable to measure L in terms of work-hours, or for that matter, even
in terms of work efficiency units. The effects of such possible "measurement
errors" can also be handled in a similar way, which we explain in subsection
B. That is, we can safely ignore the measurement error problem as well for
our present purpose.



A. Empirical Distributions 

We use the Nikkei-NEEDS database, which is a major representative database
for the listed firms in Japan. The details of this database are given in Appendix
A together with a brief description of the data-fitting.

We study productivity distributions at three aggregate levels, namely,
across workers, firms, and industrial sectors. In order to explain empirical
distributions, we find a continuous model more convenient than a discrete one.
Accordingly, we define, for example, the probability density function of firms
with productivity c as P^{(F)}(c). The number of firms with productivity
between c and c + dc is now K P^{(F)}(c) dc. It satisfies the following normalization
condition:

(21) \int_0^\infty P^{(F)}(c) \, dc = 1.

In section II, we explained that equilibrium productivity dispersion across
workers becomes the exponential distribution under the assumption that the
distribution of productivity across firms is uniform. Here, P^{(F)}(c) is not
restricted to be uniform. Rather, the distribution P^{(F)}(c) is to be determined
empirically. We analogously define P^{(W)}(c) for workers, and P^{(S)}(c) for
industrial sectors. We denote their respective cumulative probability distributions
by P_>^{(S,F,W)}(c):

(22) P_>^{(*)}(c) = \int_c^\infty P^{(*)}(c') \, dc'   (* = S, F, W).

Because the cumulative probability is the probability that a firm's (or worker's
or sector's) productivity is larger than c, it can be measured by rank-size
plots, whose vertical axis is [rank]/[total number of firms] and whose
horizontal axis is c. The rank-size plot has the advantages that it is free from
the binning problems which haunt probability density function (pdf) plots and
also that it has less statistical noise. For these reasons, in what follows, we
will show the cumulative probability rather than the probability density function.
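
The construction of a rank-size plot can be sketched in a few lines. The example below is illustrative only: the productivity values are made up, and the helper name is hypothetical. Each observation is paired with its rank divided by the total number of observations, which estimates the cumulative probability without any binning.

```python
def rank_size_points(values):
    """Return (c, rank / total) pairs, with the largest c given rank 1."""
    ordered = sorted(values, reverse=True)
    total = len(ordered)
    return [(c, rank / total) for rank, c in enumerate(ordered, start=1)]


productivities = [12.0, 3.5, 48.0, 7.2, 7.9, 150.0, 21.0, 5.1]  # made-up sample
for c, share in rank_size_points(productivities):
    print(c, share)
```

Plotting the logarithm of the second element against the logarithm of the first gives the kind of diagram used in the figures of this section.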

The productivity distribution across firms, to be exact, \log P_>^{(F)}(c)
obtained from the Nikkei-NEEDS data, is plotted in Figure 2 for the year 2005.
The dots are the data points; each dot corresponds to a firm whose position
is determined from its rank and the productivity c. For reference, the power,
exponential, and log-normal distributions are shown in the diagram. The
exponential and log-normal distributions are represented by respective curves,
while the power-law is represented by a straight line whose slope is equal to
the power exponent.

Evidently, the uniform distribution implicitly assumed in the analysis in
section II does not fit the actual data at all. For small c (low productivity), the
log-normal law (dash-dotted curve) fits well, and for large c (high productivity),
the power-law (broken line) fits well, with a smooth transition from the former
to the latter at around \log c \simeq 2. The result shown in Figure 2 is for the year
2005, but basically the same result holds for other sample periods.

The cumulative probability P_>^{(F)}(c) for large c can, therefore, be represented
by the following power distribution:

(23) P_>^{(F)}(c) \simeq \left( \frac{c}{c_0} \right)^{-\mu_F}   (c \gg c_0),

where c_0 is a parameter that defines the order of the productivity c.⁴ The
power exponent μ_F is called the Pareto index, and c_0 the Pareto scale. The
probability density function is then given by the following:

(24) P^{(F)}(c) = -\frac{d}{dc} P_>^{(F)}(c) \simeq \frac{\mu_F}{c_0} \left( \frac{c}{c_0} \right)^{-\mu_F - 1}   (c \gg c_0).

Figure 2: Productivity Distribution across Firms (2005)
Notes: The productivity c is in the unit of 10^6 yen/person.
The best fits for the exponential law and the power law
are obtained for 10 < c < 3000.

Figure 3: Productivity Distribution across Business Sectors
(2005)
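
One common way to estimate a Pareto index from the upper tail of a sample is the Hill estimator. The sketch below applies it to synthetic data drawn from the power distribution (23); it is not the fitting procedure used for the figures in this paper, and the sample size, seed, scale c_0 = 10 and tail size are all hypothetical choices.

```python
import random
from math import log


def sample_pareto(mu, c0, size, rng):
    """Inverse-transform sampling from P_>(c) = (c / c0)^(-mu) for c >= c0."""
    return [c0 * (1.0 - rng.random()) ** (-1.0 / mu) for _ in range(size)]


def hill_estimate(values, tail_size):
    """Hill estimator of the Pareto index from the tail_size largest values."""
    ordered = sorted(values, reverse=True)
    threshold = ordered[tail_size]  # the (tail_size + 1)-th largest value
    mean_log_excess = sum(log(c / threshold) for c in ordered[:tail_size]) / tail_size
    return 1.0 / mean_log_excess


rng = random.Random(0)  # fixed seed so the run is reproducible
data = sample_pareto(mu=1.5, c0=10.0, size=50_000, rng=rng)
mu_hat = hill_estimate(data, tail_size=5_000)
print(mu_hat)
```

The estimate depends on how many tail observations are used; with real data that choice plays the same role as selecting the fitted range of c.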

Next, we explore the productivity dispersion across industrial sectors. The
Nikkei-NEEDS database defines 33 sectors. The productivity distribution
across these sectors for the year 2005 is plotted in Figure 3. We observe that
once again, the power-law (a straight line in the diagram) fits the data pretty
well. The same result holds true for all the sample years.

4 We note that c_0 has the same dimension as c, so that P_>^{(F)}(c) is dimensionless, as it
should be.

Figure 4: Productivity Distribution across Workers (2005)

Finally, the productivity distribution across workers is plotted in Figure
4 for the year 2005. Again, we observe that the power-law (broken line) fits
the data very well for large c. A casual observation of Figure 4 may make one
wonder whether the share of workers for which the power-law fits well may be
small. This impression is wrong. In fact, the fitted range is approximately
\log P_>^{(W)}(c) \in [-0.4, -3.1], which translates to the rank of the workers

[10^{-0.4}, 10^{-3.1}] \times [total number of workers] \simeq [1.52 \times 10^6, 3.04 \times 10^3].

This means that some 1.52 million workers, that is, 39% of all workers, fit the
power-law.
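
The arithmetic behind these ranks can be retraced directly. In the sketch below the total number of workers is not taken from the text; it is back-solved from the quoted rank of 1.52 million at log P_> = -0.4, which is an assumption made only for this check.

```python
share_at_range_start = 10 ** -0.4  # cumulative share where the fitted range begins
share_at_range_end = 10 ** -3.1    # cumulative share where the fitted range ends
rank_at_range_start = 1.52e6       # rank quoted in the text for log P_> = -0.4

# Back-solved total number of workers (an inference, not a figure from the text).
implied_total = rank_at_range_start / share_at_range_start
implied_rank_at_range_end = share_at_range_end * implied_total

print(round(share_at_range_start, 3), round(implied_total), round(implied_rank_at_range_end))
```

The implied rank at the end of the range comes out close to the quoted 3.04 x 10^3, so the two quoted ranks are mutually consistent up to rounding.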

The values of the Pareto indices obtained are shown in Figure 5 for three
levels of aggregation. We note that the Pareto index decreases as the
aggregation level goes up from workers to firms, and from firms to the industrial
sectors.



B. Marginal vs. Average Productivity 

Before we conclude this section, we digress into the relationship between the
marginal and average productivities of labor. Theoretically, the marginal
productivity matters, while what we observe and, therefore, used in the above
analysis is the average productivity. To explore the relation between the two
productivities, c and c_M, we assume the Cobb-Douglas production function:

(25) Y = A K^{1-\alpha} L^{\alpha}   (0 < \alpha < 1).

Equation (25) leads us to the following relation:

(26) c_M = \alpha c.

Figure 5: Pareto Indices of the Productivity Distributions
across Workers, Firms, and Industrial Sectors

In general, the value of α differs across firms. Therefore, the distribution of
c_M is, in general, different from that of c.
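
For completeness, the step from the Cobb-Douglas function to c_M = α c is a one-line differentiation, using the definitions c = Y/L and c_M = ∂Y/∂L given earlier:

```latex
c_M = \frac{\partial Y}{\partial L}
    = \alpha A K^{1-\alpha} L^{\alpha-1}
    = \alpha \, \frac{A K^{1-\alpha} L^{\alpha}}{L}
    = \alpha \, \frac{Y}{L}
    = \alpha c .
```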

However, thanks to (26), we can relate the pdf of c_M, P_M(c_M), to the joint
pdf of c and α, P_{c,α}(c, α), as follows:

(27) P_M(c_M) = \int_0^1 d\alpha \int_0^\infty dc \, \delta(c_M - \alpha c) P_{c,\alpha}(c, \alpha) = \int_0^1 \frac{d\alpha}{\alpha} P_{c,\alpha}\left( \frac{c_M}{\alpha}, \alpha \right).

In general, P_{c,α}(c, α) can be written as follows:

(28) P_{c,\alpha}(c, \alpha) = P^{(F)}(c) P(\alpha | c).

Here, P(α|c) is the conditional pdf; it is the pdf of α for the fixed value of
productivity c. It is normalized as follows:

(29) \int_0^1 P(\alpha | c) \, d\alpha = 1

for any value of c. We have already seen that P^{(F)}(c) obeys the power-law for
large c:

(30) P^{(F)}(c) = \frac{\mu_F}{c_0} \left( \frac{c}{c_0} \right)^{-\mu_F - 1}.

From (30), (28) and (27), we obtain the following:

(31) P_M(c_M) = \frac{\mu_F}{c_0} \left( \frac{c_M}{c_0} \right)^{-\mu_F - 1} \int_0^1 d\alpha \, \alpha^{\mu_F} P\left( \alpha \,\Big|\, \frac{c_M}{\alpha} \right).

Here, we assumed that P(α|c) does not extend too near α = 0 so that c_M/α
stays in the asymptotic region. From (31), we can conclude that if α and c
are independent, namely

(32) P(\alpha | c) = P_\alpha(\alpha),

then the distribution of marginal productivity, P_M(c_M), also obeys the power
law with the identical Pareto index μ_F as for the average productivity:

(33) P_M(c_M) \propto c_M^{-\mu_F - 1}.

Figure 6: Rank-Size Log-log Plots of the Simulation Results
Notes: Distribution for c is for the Monte-Carlo
data generated from a P(c) with μ_F = 1.5. Data for c_M
is then created with P_α(α) distributed from α = 0.5 to 1
uniformly.

In conclusion, to the extent that α and c are independent, the distribution
of the unobserved marginal productivity obeys the power-law with the same
Pareto index as for the observed average productivity. A sample Monte-Carlo
simulation result is shown in Figure 6. Because these are log-log plots, the
gradient of the straight region gives the Pareto index. The equality of
the Pareto indices is clearly seen.
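
A simulation of this kind can be sketched as follows. The settings mirror the description of the Monte-Carlo experiment (Pareto index 1.5 for c, and α uniform between 0.5 and 1), but the sample size, seed, scale c_0 = 1 and the use of a Hill estimator to read off the tail index are assumptions made for this illustration.

```python
import random
from math import log


def hill_estimate(values, tail_size):
    """Hill estimator of the Pareto index from the tail_size largest values."""
    ordered = sorted(values, reverse=True)
    threshold = ordered[tail_size]
    return tail_size / sum(log(v / threshold) for v in ordered[:tail_size])


rng = random.Random(1)  # fixed seed so the run is reproducible
mu_F, c0, size = 1.5, 1.0, 100_000

# Average productivity c drawn from P_>(c) = (c / c0)^(-mu_F).
c = [c0 * (1.0 - rng.random()) ** (-1.0 / mu_F) for _ in range(size)]
# Labor exponent alpha drawn independently of c, uniform on [0.5, 1].
alpha = [rng.uniform(0.5, 1.0) for _ in range(size)]
# Marginal productivity from equation (26).
c_M = [a * x for a, x in zip(alpha, c)]

mu_c = hill_estimate(c, tail_size=5_000)
mu_cM = hill_estimate(c_M, tail_size=5_000)
print(mu_c, mu_cM)
```

Both estimates should land near 1.5, differing only by sampling noise, which is the numerical counterpart of the statement that independence of α and c preserves the Pareto index.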

In our analysis, L is the number of workers. Theoretically, it would be
desirable to measure L in terms of work-hours or, for that matter, even in
terms of work efficiency. We can apply the above analysis to the case where
the problem is measurement error by simply interpreting α in (26) as such
measurement error rather than a parameter of the Cobb-Douglas production
function. Thus, to the extent that the average productivity c and measurement
error α are independent, the distribution of "true" productivity obeys
the power-law with the same exponent as for the measured productivity. We
conclude that the power-laws for productivity dispersion across workers, firms
and industrial sectors obtained above are quite robust.

C. Summary of Empirical Observations

We summarize the empirical observations as two stylized facts:

I. The distribution of productivity obeys the Pareto distribution (i.e. the
power-law for the high productivity group) at every level of aggregation,
that is, across workers, firms, and industrial sectors.

II. The Pareto index, namely the power exponent, decreases as the level of
aggregation goes up: μ_W > μ_F > μ_S (Figure 5).

As we explained in section II, the assumption that the distribution
of productivity across firms is uniform leads to the exponential distribution of
productivity across workers. Obviously, this model does not fit the empirical
observations. In the next section, extending the basic framework, we present
a theoretical model which explains the above stylized facts.

IV. Theory 

In this section, we develop a new theoretical model which explains the two stylized
facts in three steps. First, we discuss the generic framework that explains
the power-law distribution for firms' productivity. Secondly, we extend the
basic model explained in section II by incorporating the stylized fact that
productivity distribution across firms is not uniform, but rather obeys the
power-law. Because the extended model still fails to explain the stylized
facts, we take the third step: we propose the superstatistics framework. It
can explain the two stylized facts presented in section III.

A. Firm's Productivity Dispersion: A Model of Jump Markov Process 

The standard economic analysis takes it for granted that all the production 
factors enjoy the highest marginal productivity in equilibrium. However, this 
is a wrong characterization of the economy. The fact is that production fac- 
tors cannot be reallocated instantaneously in such a way that their marginal 
products are equal in all economic activities. Rather, at each moment in time, 
there exists a dispersion or distribution of productivity as shown in the pre- 
ceding section. Evidently, an important reason why the marginal products 
of workers are not equated is, as Mortensen (2003) suggests, that there are 
differences in productivity across firms. To describe the dynamics of firm's 
productivity, we employ a continuous-time jump Markov process. Using the 
Markov model, we can show that the power-law distribution is a generic con- 
sequence under a reasonable assumption. 

Suppose that a firm has a productivity denoted by c. In a small time
interval dt, the firm's productivity c increases by a small amount, which
we can assume is unity without loss of generality, with probability w_+(c) dt.
We denote it as w_+(c) because this probability depends on the level of
c. Likewise, it decreases by unity with probability w_-(c) dt. Thus, w_+(c)
and w_-(c) are transition rates for the processes c → c + 1 and c → c - 1,
respectively.

We also assume that a new firm is born with a unit of productivity with
probability p dt. On the other hand, a firm with c = 1 will be dead if its
productivity falls to zero; thus the probability of exit is w_-(c = 1) dt. A set
of the transition rates and the entry probability specifies the jump Markov
process.

Given this Markov model, the evolution of the average number of firms
having productivity c at time t, n(c, t), obeys the following master equation:

(34) \frac{\partial n(c,t)}{\partial t} = w_+(c-1) n(c-1,t) + w_-(c+1) n(c+1,t) - w_+(c) n(c,t) - w_-(c) n(c,t) + p \delta_{c,1}.

Here, δ_{c,1} is 1 if c = 1 and 0 otherwise. This equation shows that the change
in n(c, t) over time is nothing but the net inflows to the state c.
The total number of firms is given by

(35) K_t = \sum_{c=1}^{\infty} n(c,t).

We define the aggregate productivity index C as follows:

(36) C_t = \sum_{c=1}^{\infty} c \, n(c,t).

It follows from (34) that

(37) \frac{d}{dt} K_t = p - w_-(1) n(1,t),

(38) \frac{d}{dt} C_t = p - \sum_{c=1}^{\infty} \left( w_-(c) - w_+(c) \right) n(c,t).
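
The two identities (37) and (38) can be checked numerically by summing the right-hand side of the master equation (34) over all states. The sketch below does this for a hypothetical set of occupation numbers with finite support and hypothetical transition rates; none of the numerical values come from the paper.

```python
def master_rhs(n, w_plus, w_minus, p):
    """Right-hand side of (34) for c = 1..M+1, given n(c) for c = 1..M.

    n is a list with n[c - 1] = n(c); n(c) is taken to be zero outside 1..M.
    """
    M = len(n)

    def occupation(c):
        return n[c - 1] if 1 <= c <= M else 0.0

    rhs = []
    for c in range(1, M + 2):
        inflow_from_below = w_plus(c - 1) * occupation(c - 1) if c > 1 else 0.0
        inflow_from_above = w_minus(c + 1) * occupation(c + 1)
        outflow = (w_plus(c) + w_minus(c)) * occupation(c)
        entry = p if c == 1 else 0.0
        rhs.append(inflow_from_below + inflow_from_above - outflow + entry)
    return rhs


def w_plus(c):
    return 0.9 * c ** 2  # hypothetical upward rate


def w_minus(c):
    return 1.0 * c ** 2  # hypothetical downward rate


p = 0.3                        # hypothetical entry rate
n = [5.0, 2.0, 1.5, 0.7, 0.2]  # hypothetical occupation numbers n(1)..n(5)

rhs = master_rhs(n, w_plus, w_minus, p)
dK_dt = sum(rhs)
dC_dt = sum(c * r for c, r in enumerate(rhs, start=1))
eq37 = p - w_minus(1) * n[0]
eq38 = p - sum((w_minus(c) - w_plus(c)) * n_c for c, n_c in enumerate(n, start=1))
print(dK_dt, eq37, dC_dt, eq38)
```

The state M + 1 is included in the sum because firms at the top occupied level can still move up by one unit.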



We consider the steady state. It is the stationary solution of (34) such that
∂n(c,t)/∂t = 0. The solution n(c) can be readily obtained by the standard
method. Setting (37) equal to zero, we obtain the boundary condition
w_-(1) n(1) = p. Using this boundary condition, we can easily show

(39) n(c) = n(1) \prod_{k=1}^{c-1} \frac{w_+(c-k)}{w_-(c-k+1)}.

Next, we make an important assumption on the transition rates, w_+ and
w_-. Namely, we assume that the probabilities of an increase and a decrease
of productivity depend on the firm's current level of productivity. Specifically,
the higher the current level of productivity is, the larger the chance of a unit
productivity change is. This assumption means that the transition rates can
be written as w_+(c) = a_+ c^α and w_-(c) = a_- c^α, respectively. Here, a_+ and
a_- are positive constants, and α is greater than 1. Under this assumption,
the stationary solution (39) becomes

(40) n(c) = n(1) \frac{\left( 1 - n(1)/C_{(\alpha)} \right)^{c-1}}{c^{\alpha}} \simeq n(1) \, c^{-\alpha} e^{-c/c^*},

where

c^* \equiv \left( n(1)/C_{(\alpha)} \right)^{-1} and C_{(\alpha)} \equiv \sum_{c=1}^{\infty} c^{\alpha} n(c).

We have used the relation a_+/a_- = 1 - n(1)/C_{(α)}, which follows from
(37) and (38). The approximation in (40) follows from n(1)/C_{(α)} ≪ 1. The
exponential cut-off works as c approaches c^*. However, the value of c^*
is practically quite large. Therefore, we observe the power-law distribution
n(c) ∝ c^{-α} for a wide range of c in spite of the cut-off. We note that the power
exponent μ for the empirical distribution presented in section III is related to
α simply by

μ = α - 1.
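
The stationary solution can be built up numerically from the balance of flows between neighbouring states and compared with the power law with exponential cut-off. The following is a minimal sketch, assuming hypothetical parameter values a_+ = 0.999, a_- = 1 and α = 2, with n(1) normalized to one; these numbers are for illustration only.

```python
from math import exp

a_plus, a_minus, alpha = 0.999, 1.0, 2.0  # hypothetical parameter values


def w_plus(c):
    return a_plus * c ** alpha


def w_minus(c):
    return a_minus * c ** alpha


def stationary_by_product(n1, c_max):
    """Stationary n(c) from the recursion n(c+1) = n(c) w_+(c) / w_-(c+1)."""
    n = [n1]
    for c in range(1, c_max):
        n.append(n[-1] * w_plus(c) / w_minus(c + 1))
    return n  # n[c - 1] holds n(c)


n1, c_max = 1.0, 200
n = stationary_by_product(n1, c_max)

# Closed form implied by power-law rates: n(1) (a_+/a_-)^(c-1) c^(-alpha).
closed_form = [n1 * (a_plus / a_minus) ** (c - 1) * c ** (-alpha)
               for c in range(1, c_max + 1)]

# Power law with exponential cut-off, using a_+/a_- = 1 - 1/c_star.
c_star = 1.0 / (1.0 - a_plus / a_minus)
approx = [n1 * c ** (-alpha) * exp(-c / c_star) for c in range(1, c_max + 1)]

print(c_star, n[99], closed_form[99], approx[99])
```

With a_+ close to a_- the cut-off scale is large, so the pure power law n(c) ∝ c^{-α} describes the solution over a wide range of c.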

The present model can be understood easily with the help of an analogy
with the formation of cities. Imagine that $n(c, t)$ is the number of cities with
population $c$ at time $t$. $w_+(c)$ corresponds to a birth in a city with population
$c$, or an inflow into the city from another city. Similarly, $w_-(c)$ represents
a death or an exit of a person moving to another city. The rates are the
instantaneous probabilities that the population of a city with current population
$c$ either increases or decreases by one. They are, therefore, the entry and exit
rates of one person times population $c$, respectively. And a drifter forms
his own one-person city with the instantaneous probability $p$. In this model,
the dynamics of $n(c, t)$, namely the average number of cities with population $c$,
is given by equation (34). In the case of population dynamics, one might assume
that the entry (or birth) and exit (or death) rates of a person, $a_+$ and $a_-$, are
independent of the size of population of the city in which the person lives.
Then, $w_+(c)$ and $w_-(c)$ become linear functions of $c$, namely, $a_+ c$ and $a_- c$.
Even in population dynamics, though, one might assume that the entry rate
of a person into a big city is higher than its counterpart in a small city because
of better job opportunities or the attractiveness of "city life." The same may
hold for the exit and death rates because of congestion or epidemics.

It turns out that in the dynamics of firm productivity, both the "entry"
and "exit" rates of an existing "productivity unit" are increasing functions
of $c$, namely the level of productivity of which that particular unit
happens to be a part. To be concrete, they are $a_+ c$ and $a_- c$. Thus, $w_+(c)$,
for example, becomes $a_+ c$ times $c$, which is equal to $a_+ c^2$. Likewise, we
obtain $w_-(c) = a_- c^2$. This is the case of $\alpha = 2$, the so-called Zipf law (see
Sutton (1997)).



See Ijiri and Simon (1975) and Marsili and Zhang (1998) for the formation of cities whose

There is also a technical reason why we may expect $\alpha$ to be larger than
two. From (40), we can write the total number of firms and the aggregate
productivity index as

(41)  $K = \dfrac{n(1)}{\Gamma(\alpha)} \int_0^{\infty} \dfrac{t^{\alpha-1}}{e^{t} - 1 + n(1)/C^{(\alpha)}}\,dt,$

(42)  $C = \dfrac{n(1)}{\Gamma(\alpha-1)} \int_0^{\infty} \dfrac{t^{\alpha-2}}{e^{t} - 1 + n(1)/C^{(\alpha)}}\,dt,$

where $\Gamma(z)$ is the gamma function defined by $\Gamma(z) = \int_0^{\infty} t^{z-1} e^{-t}\,dt$. In the
limit as $n(1)/C^{(\alpha)}$ goes to zero, the integral in (41) is finite for $\alpha > 1$, while
the integral in (42) tends to be arbitrarily large for $1 < \alpha < 2$. Therefore, for
the finiteness of both (41) and (42), it is reasonable to assume that $\alpha > 2$.
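The finiteness argument can be illustrated with partial sums. The sketch below is only an illustration under stated assumptions: it takes the limit $n(1)/C^{(\alpha)} \to 0$, in which the stationary solution reduces to $n(c) = n(1)\,c^{-\alpha}$, so that $K/n(1) = \sum_c c^{-\alpha}$ and $C/n(1) = \sum_c c^{1-\alpha}$. The cut-offs and the two values of $\alpha$ are arbitrary choices.

```python
def partial_sums(alpha, cutoff):
    """Partial sums of K/n(1) = sum c^(-alpha) and C/n(1) = sum c^(1-alpha)
    in the limit where the exponential cut-off disappears."""
    k_sum = 0.0
    c_sum = 0.0
    for c in range(1, cutoff + 1):
        k_sum += c ** (-alpha)
        c_sum += c ** (1.0 - alpha)
    return k_sum, c_sum

for alpha in (1.5, 2.5):
    for cutoff in (10**3, 10**4, 10**5):
        k_sum, c_sum = partial_sums(alpha, cutoff)
        print(alpha, cutoff, round(k_sum, 4), round(c_sum, 4))
```

For $\alpha = 1.5$ the second sum keeps growing with the cut-off while the first settles down; for $\alpha = 2.5$ both settle down, in line with the requirement $\alpha > 2$.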



In summary, under the reasonable assumption that the probability of a
unit change in productivity is an increasing function of its current level, $c$, we
obtain the power-law distribution that we actually observe. Now, economists are
prone to take changes in productivity as "technical progress." That is why the
focus of attention is so often on R&D investment. However, if productivity
growth is always technical progress, its decrease must be "technical regression,"
the very existence of which one might question. At the firm level, an
important source of productivity change is actually a sectoral shift of demand.
When demand for product A increases, for example, productivity at the firm
producing A increases, and vice versa. Fay and Medoff (1985) indeed document
such changes in firms' labor productivity by way of changes in the rate
of labor hoarding. Stochastic productivity changes which our Markov model
describes certainly include technical progress, particularly in the case of an
increase, but at the same time, represent allocative demand disturbances, the
importance of which Davis et al. (1996) have so persuasively demonstrated in
their book entitled Job Creation and Destruction. The empirical observation
on productivity dispersion and our present analysis suggest that the probability
of a unit allocative demand disturbance depends on the current size of the
firm or sector.



B. The Stochastic Macro-equilibrium Once Again

The basic framework explained in section II presumes that productivity
dispersion across firms, namely $P^{(F)}_{>}(c)$, is uniform. However, $P^{(F)}_{>}(c)$
actually obeys the power-law. Having clarified the generic origin of the power-law
distribution for $c$, we now turn our attention to the productivity dispersion
across workers. We must extend the theory of the stochastic macro-equilibrium
explained in section II under the assumption that $P^{(F)}_{>}(c)$ is the
power-law distribution.

sizes obey the Zipf law. Our present model of productivity dynamics has, in fact, a very
close analogy to the dynamics of city size. It is indeed mathematically equivalent to the model
of Marsili and Zhang (1998), which analyzed the dynamics of city size, except for an additional
assumption that $n(1)/C^{(\alpha)} = p$.

In order to develop the new theoretical framework, it is better to adopt
the continuous notation we introduced previously. An example is

(43)  $K = \sum_{k=1}^{\infty} n(c_k) \;\longrightarrow\; K \int_0^{\infty} P^{(F)}(c)\,dc.$

In the continuous model, the corresponding equations of section II,
including (5) and (8), read

(44)  $S = -K \int_0^{\infty} \bigl(p(c) \ln p(c)\bigr)\,P^{(F)}(c)\,dc,$

(45)  $K \int_0^{\infty} p(c)\,P^{(F)}(c)\,dc = 1,$

(46)  $K \int_0^{\infty} c\,p(c)\,P^{(F)}(c)\,dc = D,$

respectively. Here, we have replaced $p_k = n_k/N$ by the continuous function $p(c)$.
Note that because the distribution of productivity across firms is no longer
uniform, but is $P^{(F)}(c)$, the corresponding distribution across workers, $P^{(W)}(c)$,
is

(47)  $P^{(W)}(c) = K\,p(c)\,P^{(F)}(c).$



Using $P^{(W)}(c)$, we can rewrite equations (44), (45) and (46) as follows:

(48)  $S = -\int_0^{\infty} P^{(W)}(c) \ln \dfrac{P^{(W)}(c)}{P^{(F)}(c)}\,dc + [\text{const.}],$

(49)  $\int_0^{\infty} P^{(W)}(c)\,dc = 1,$

(50)  $\int_0^{\infty} c\,P^{(W)}(c)\,dc = D.$

Here, [const.] in equation (48) is an irrelevant constant term.

We maximize $S$ (equation (48)) under the two constraints (49) and (50) by
means of the calculus of variations with respect to $P^{(W)}(c)$ to obtain

(51)  $P^{(W)}(c) = \dfrac{1}{Z(\beta)}\,P^{(F)}(c)\,e^{-\beta c}.$

Here, as in section II, $\beta$ is the Lagrangian multiplier for the aggregate demand
constraint, (46), and the partition function $Z(\beta)$ is given by

(52)  $Z(\beta) = \int_0^{\infty} P^{(F)}(c)\,e^{-\beta c}\,dc.$

It is easy to see that constraint (49) is satisfied. Constraint (50) now reads

(53)  $D = \dfrac{1}{Z(\beta)} \int_0^{\infty} c\,P^{(F)}(c)\,e^{-\beta c}\,dc.$

Figure 7: Productivity Distributions across Workers and Firms.
Notes: The solid curve is for firms with $\mu_F = 1.5$, while the dashed curves
are for workers with $\beta = 0.01, 0.1, 1$, respectively. When $\beta$ is small enough,
the distribution across workers is close to that of firms. As $\beta$ increases, the
distribution across workers is suppressed for large $c$ due to the exponential
factor $e^{-\beta c}$.



This equation is equivalent to:

(54)  $D = -\dfrac{d}{d\beta} \ln Z(\beta).$

It is straightforward to see that equations (51) and (52) are the counterparts
of equations (14) and (15), respectively, under the assumption that the
productivity dispersion across firms is not uniform but is $P^{(F)}_{>}(c)$.

Because the productivity distribution across firms, $P^{(F)}(c)$, obeys the Pareto
law, the relation between the aggregate demand $D$ and $\beta$ (or the temperature)
is not so simple as shown in equation (18), but is, in general, quite
complicated. However, we can prove that the exponent $\beta$ is a decreasing
function of the aggregate demand, $D$. Thus, the fundamental proposition
that when the aggregate demand is high, production factors are mobilized to
firms and sectors with high productivity (Figure 1), holds true in the extended
model as well. We provide the proof in Appendix B.
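The monotonic relation between $D$ and $\beta$ can be seen numerically. The following is a minimal sketch under stated assumptions rather than the authors' computation: the firm distribution is taken to be a pure Pareto density with a lower cut-off `C_MIN` (needed for normalization) and a numerical upper cut-off `C_MAX`, both hypothetical, and the integrals in (52) and (53) are evaluated with the trapezoidal rule on a logarithmic grid.

```python
import math

MU_F = 1.5      # Pareto index of firms, the value used in Figure 7
C_MIN = 1.0     # assumed lower cut-off; a pure power law is not normalizable at c = 0
C_MAX = 1.0e6   # assumed numerical upper cut-off
N_GRID = 20000

def firm_pdf(c):
    """Pareto density P^(F)(c) = mu_F c_min^mu_F c^(-mu_F - 1) for c >= c_min."""
    return MU_F * C_MIN ** MU_F * c ** (-MU_F - 1.0)

# Logarithmic grid: c_i = C_MIN * (C_MAX / C_MIN)^(i / N_GRID).
GRID = [C_MIN * (C_MAX / C_MIN) ** (i / N_GRID) for i in range(N_GRID + 1)]

def integrate(func):
    """Trapezoidal rule on the logarithmic grid."""
    total = 0.0
    for left, right in zip(GRID[:-1], GRID[1:]):
        total += 0.5 * (func(left) + func(right)) * (right - left)
    return total

def partition_function(beta):
    """Z(beta) of equation (52)."""
    return integrate(lambda c: firm_pdf(c) * math.exp(-beta * c))

def aggregate_demand(beta):
    """D(beta) of equation (53): the mean productivity of workers."""
    numerator = integrate(lambda c: c * firm_pdf(c) * math.exp(-beta * c))
    return numerator / partition_function(beta)

for beta in (1.0, 0.1, 0.01, 0.001):
    print(beta, aggregate_demand(beta))
```

As $\beta$ falls, the printed $D$ rises toward the mean of the firm distribution, which is $\mu_F/(\mu_F - 1) = 3$ for the assumed Pareto density with unit lower cut-off.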

Now, we explore productivity dispersion in this extended model. The
productivity dispersion across workers, $P^{(W)}(c)$, relates to that across firms,
$P^{(F)}(c)$, by way of equation (51). The latter obeys the power law. Figure 7
shows examples of the productivity distributions across workers for several
different values of $D$, given $P^{(F)}(c)$. The solid curve is for firms with $\mu_F = 1.5$.
The dashed lines are the corresponding distributions for workers with
different values of $D$. All of them have strong suppression for large $c$ due to
the exponential factor $e^{-\beta c}$. They are in stark contrast to the power-law. Note
that the power-law corresponds to a straight line in the figure, which shows the
relation between $\ln P_{>}(c)$ and $\ln c$. The only way to reconcile the distribution
(51) with the observed power-law is to assume an extremely small value of $\beta$ so
that the Boltzmann factor $e^{-\beta c}$ becomes close to one, and does not suppress
$P^{(W)}(c)$. This trick, however, does not work because, in this case, it yields
the Pareto index $\mu_W$ of the worker's productivity distribution equal to that of
the firm's distribution, $\mu_F$. It is inconsistent with the empirical observation
that $\mu_W > \mu_F$ (Figure 5). Comparing Figures 3 and 7, we conclude that the
extended model still fails to explain the observations. We must seek a new
theoretical framework.



C. Worker's Productivity Dispersion under Fluctuating Aggregate Demand 

The theoretical framework explained so far implicitly assumes that the aggre- 
gate demand, D is constant. Plainly, this is an oversimplification; D fluctu- 
ates. 

A macro-system under fluctuations of the external environment can be analyzed
with the help of superstatistics or "statistics of statistics" in statistical physics
(Beck and Cohen, 2003). In this theory, the system goes through changing
external influences, but is in equilibrium at the limited scale in time and/or
space, in which the temperature may be regarded as constant and the Boltzmann
distribution is achieved. In other words, the system is only locally
in equilibrium; globally seen, it is out of equilibrium. In order to analyze
such a system, superstatistics introduces averaging over the Boltzmann factors.
Depending on the weight function used for averaging, it can yield various
distributions, including the power-law (Touchette and Beck, 2005).



GDP of a particular year is certainly a scalar constant when the year
is over. However, it actually fluctuates daily even if those fluctuations cannot
be practically measured. Accordingly, the macro environment surrounding firms
keeps changing almost continuously. Differences by region, industry and sector
can also be taken care of by the superstatistics approach. When the aggregate
demand $D$ changes, the new stochastic process conditioned by the new $D$
allows production factors to move to a different equilibrium. Averaged over
various possible equilibria, each of which depends on a particular value of
$D$, the resulting distribution becomes different from any single equilibrium
distribution.

Specifically, in superstatistics, the familiar Boltzmann factor, $\exp(-\beta c)$, is
replaced by the following weighted average:

(55)  $B(c) = \int_0^{\infty} f(\beta)\,e^{-\beta c}\,d\beta.$

Here, the weight factor $f(\beta)$ represents the changing macroeconomic environment.
Note that because $\beta$ is a monotonically decreasing function of $D$, the


6 There are several model cases where superstatistics was applied successfully. Among
them, the Brownian motion of a particle going through a changing environment provides a
good analogy to our case (Beck, 2006).

weight factor $f(\beta)$ corresponds to changes in the aggregate demand. With
this weight factor, the probability distribution of worker's productivity, (51),
is now replaced by the following:

(56)  $P^{(W)}(c) = \dfrac{1}{Z_B}\,P^{(F)}(c)\,B(c).$

Here, the partition function, $Z_B$, is also redefined as

(57)  $Z_B = \int_0^{\infty} P^{(F)}(c)\,B(c)\,dc.$



We now examine whether $P^{(W)}(c)$ in equation (56) obeys the power-law
for high productivity $c$. The integration in equation (55) is dominated by the
small $\beta$ (high demand) region for large $c$. We assume the following behavior
of the pdf $f(\beta)$ for $\beta \to 0$,

(58)  $f(\beta) \propto \beta^{-\gamma} \quad (\gamma < 1),$

where the constraint for the parameter $\gamma$ comes from the convergence of the
integration in equation (55). The proportional constant is irrelevant because
$P^{(W)}(c)$ is normalized by $Z_B$. This leads to the following $B(c)$ for large $c$:

(59)  $B(c) \propto \Gamma(1-\gamma)\,c^{\gamma-1}.$

Substituting (59) into equation (56), we find that the productivity distribution
across workers obeys the power-law;

(60)  $P^{(W)}(c) \propto c^{-\mu_W - 1},$

with

(61)  $\mu_W = \mu_F - \gamma + 1.$

Because of the constraint $\gamma < 1$, this leads to the inequality

(62)  $\mu_W > \mu_F.$

This agrees with our empirical observation (Figure 5 or stylized Fact II in
section III.C).
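The step from (58) to (59) can be illustrated by averaging Boltzmann factors directly. The sketch below is an illustration under an assumed weight function, not the paper's $f(\beta)$: it takes $f(\beta) = \beta^{-\gamma} e^{-\beta}/\Gamma(1-\gamma)$, a gamma density that behaves as $\beta^{-\gamma}$ for small $\beta$, for which $B(c) = (1+c)^{\gamma-1}$ exactly. The values of `GAMMA`, `MU_F`, the sample size and the seed are arbitrary choices.

```python
import math
import random

GAMMA = 0.5     # exponent in f(beta) ~ beta^(-gamma); must be < 1
MU_F = 1.5      # Pareto index of firms (illustrative)
N_SAMPLES = 200_000

def averaged_boltzmann_factor(c, betas):
    """Monte Carlo estimate of B(c) = E[exp(-beta c)], equation (55)."""
    return sum(math.exp(-beta * c) for beta in betas) / len(betas)

def exact_b(c):
    """Closed form of B(c) for the gamma weight assumed here."""
    return (1.0 + c) ** (GAMMA - 1.0)

rng = random.Random(0)
# Gamma density with shape 1 - gamma and unit scale:
# f(beta) = beta^(-gamma) exp(-beta) / Gamma(1 - gamma).
betas = [rng.gammavariate(1.0 - GAMMA, 1.0) for _ in range(N_SAMPLES)]

for c in (10.0, 100.0):
    print(c, averaged_boltzmann_factor(c, betas), exact_b(c))

mu_w = MU_F - GAMMA + 1.0   # equation (61)
print("mu_W =", mu_w)
```

For large $c$ the closed form behaves as $c^{\gamma-1}$, the power in (59), so multiplying by $P^{(F)}(c) \propto c^{-\mu_F-1}$ gives the worker exponent in (61).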

Because $\beta$ is related to $D$ by way of equation (54), the pdf $f_\beta(\beta)$ of $\beta$ is
related to the pdf $f_D(D)$ of the (fluctuating) $D$ as follows:

(63)  $f_\beta(\beta)\,d\beta = f_D(D)\,dD.$

As is noted several times, small $\beta$ corresponds to high aggregate demand, $D$.
In particular, the following relation holds for $\beta \to 0$:

(64)  $\langle c \rangle_0 - D \propto \begin{cases} \beta & \text{for } 2 < \mu_F; \\ \beta^{\mu_F - 1} & \text{for } 1 < \mu_F < 2. \end{cases}$



7 The proof is available on request.

Figure 8: The Distribution of the Aggregate Demand $D$, $f_D(D)$ (left) and the
Corresponding Cumulative Productivity Distribution $P^{(W)}_{>}(c)$ (right).
Notes: The solid curves are for $\delta = -1$ whereas the broken curves are for
$\delta = -2$. The productivity distribution of firms is chosen to have $\mu_F = 1.5$.



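This leads to

(65)  $f_D(D) \propto \bigl(\langle c \rangle_0 - D\bigr)^{-\delta}$

with

(66)  $\gamma - 1 = \begin{cases} \delta - 1 & \text{for } 2 < \mu_F; \\ (\mu_F - 1)(\delta - 1) & \text{for } 1 < \mu_F < 2. \end{cases}$

Here, the parameter $\delta$ is constrained by

(67)  $\delta < 1$

from the normalizability of the distribution $f_D(D)$, which is consistent with
the constraint $\gamma < 1$ and equation (66).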



Equation (65) means that changes in the aggregate demand, $D$, follow the
power-law. Gabaix (2005) indeed demonstrates that idiosyncratic shocks to
the top 100 firms explain a large fraction (one third) of aggregate volatility
for the U.S. economy. This is a characteristic of the power-law. In any case, $D$ is
now not constant, but rather fluctuates. Accordingly, the problem is not the
relation between the productivity dispersion across workers and the level of
$D$, but rather how the distribution depends on the way in which $D$ fluctuates.

Figure 8 shows an example of the relation between the distribution of $D$,
$f_D(D)$, near the upper bound $\langle c \rangle_0$ and the cumulative productivity distribution
of workers $P^{(W)}_{>}(c)$. In the figure, the solid curves correspond to the large value



8 At $\mu_F = 2$, we need additional logarithmic factors for $f(\beta)$, but the power of $\beta$ is
essentially the boundary case between the above two, $\gamma = \delta$.

Figure 9: Relation between $\mu_W$ and $\mu_F$ (68). Notes: The solid
line is the relation (68), and the filled circle is the data.



of $\delta$ whereas the broken curves correspond to the small value. Figure 8
demonstrates that as the distribution of $D$ becomes skewed toward large $D$, the
tail of the productivity distribution becomes heavier. Roughly speaking, when $D$
is high, the productivity dispersion becomes skewed toward the higher level, and
vice versa.



Combining equations (61) and (66), we reach the following relation between
the Pareto indices:

(68)  $\mu_W = \begin{cases} \mu_F - \delta + 1 & \text{for } 2 < \mu_F; \\ (\mu_F - 1)(-\delta + 1) + \mu_F & \text{for } 1 < \mu_F < 2. \end{cases}$

This relation between $\mu_W$ and $\mu_F$ is illustrated in Figure 9. As noted previously,
because of the constraint $\delta < 1$, equation (68) necessarily makes $\mu_W$
larger than $\mu_F$. This is in good agreement with our empirical finding. Incidentally,
equation (68) has a fixed point at $(\mu_W, \mu_F) = (1, 1)$; the line defined
by equation (68) always passes through this point irrespective of the value of
$\delta$. The Pareto index for firms is smaller than that for workers, but it cannot
be less than one, because of the existence of the fixed point (1, 1).
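Relation (68) is simple enough to tabulate. The following sketch is an illustration only: the function name `worker_pareto_index` and the sample values of $\delta$ and $\mu_F$ are hypothetical. It accepts $\mu_F = 1$ as the limiting case so that the fixed point (1, 1) can be displayed.

```python
def worker_pareto_index(mu_f, delta):
    """mu_W from equation (68), given mu_F >= 1 and delta < 1.

    mu_F = 1 is included only as the limiting fixed point (1, 1).
    """
    if mu_f < 1.0:
        raise ValueError("equation (68) is stated for mu_F > 1")
    if delta >= 1.0:
        raise ValueError("normalizability requires delta < 1, equation (67)")
    if mu_f > 2.0:
        return mu_f - delta + 1.0
    return (mu_f - 1.0) * (1.0 - delta) + mu_f

# Hypothetical values of delta; each row shows mu_W at mu_F = 1, 1.5, 2, 2.5.
for delta in (-2.0, -1.0, 0.5):
    row = [worker_pareto_index(mu_f, delta) for mu_f in (1.0, 1.5, 2.0, 2.5)]
    print(delta, row)
```

Every row starts at 1 when $\mu_F = 1$ and lies above $\mu_F$ elsewhere, and the two branches agree at $\mu_F = 2$.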

The superstatistics framework presented above may apply to any adjoining
levels of aggregation. Instead of applying it to workers and firms, we may
apply it to firms and industrial sectors. Then, we can draw the conclusion
that as we go up from firms to industrial sectors, the Pareto index again goes
down, albeit for a different value of $\delta$. This is illustrated in Figure 10. Because
of the existence of the point (1, 1), as the aggregation level goes up, the Pareto
index is driven toward 1, but not beyond 1. At the highest aggregation level,
it is expected to be close to one. This is again in good agreement with our
empirical finding that the Pareto index of the industrial sector, $\mu_S$, is close to
one (see Figure 5).

In summary, the superstatistics framework successfully explains the two
empirical findings we have summarized in section III.C. Furthermore, given the
measured values of $\mu_W$ and $\mu_F$, the relation (68) can be used to determine
the value of $\delta$:

(69)  $\delta = \begin{cases} \mu_F - \mu_W + 1 & \text{for } 2 < \mu_F; \\ \dfrac{\mu_F - \mu_W}{\mu_F - 1} + 1 & \text{for } 1 < \mu_F < 2. \end{cases}$

Figure 10: Changes of the Pareto Index as the Aggregation Level Changes in
Two Steps, each with a Different Value of $\delta$

The result is shown in Figure 11. Recall that $\delta$ is the power exponent of the
distribution of aggregate demand, $D$. Therefore, low $\delta$ means a relatively
low level of the aggregate demand. In Figure 11 we observe that the aggregate
demand was high during the late 1980s, while beginning in the early 1990s, it
declined to the bottom in 2000-2001, and then turned up afterward. It is
broadly consistent with changes in the growth rate during the period.
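The inversion in (69) can be written as a small helper. This is a sketch with made-up inputs, not the values behind Figure 11: the function names and the sample pairs $(\mu_W, \mu_F)$ are hypothetical, and the forward relation (68) is included only to confirm that the two formulas are consistent with each other.

```python
def pareto_index_from_delta(mu_f, delta):
    """Forward relation (68): mu_W as a function of mu_F and delta."""
    if mu_f > 2.0:
        return mu_f - delta + 1.0
    return (mu_f - 1.0) * (1.0 - delta) + mu_f

def delta_from_indices(mu_w, mu_f):
    """Inverse relation (69): delta from measured Pareto indices."""
    if mu_f <= 1.0:
        raise ValueError("equation (69) is stated for mu_F > 1")
    if mu_f > 2.0:
        return mu_f - mu_w + 1.0
    return (mu_f - mu_w) / (mu_f - 1.0) + 1.0

# Hypothetical (mu_W, mu_F) pairs, chosen for illustration only.
for mu_w, mu_f in ((2.1, 1.5), (1.8, 1.4), (3.0, 2.5)):
    delta = delta_from_indices(mu_w, mu_f)
    # Round trip: plugging delta back into (68) should return mu_W.
    print(mu_w, mu_f, delta, pareto_index_from_delta(mu_f, delta))
```

Any pair with $\mu_W > \mu_F > 1$ yields $\delta < 1$, as required by (67).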

V. Implications 

The standard economic analysis takes it for granted that production factors
move fast enough from low to high productivity firms and sectors, and that
as a consequence, sooner or later they enjoy the same (highest) marginal
productivity. Otherwise, it contradicts the concept of equilibrium. However, we
have some evidence suggesting that there is always productivity dispersion
in the economy. As we noted in the Introduction, Mortensen (2003), analyzing
wage dispersion, argued that there is productivity dispersion across firms.
Okun (1973) also argued that a part of the reason why we obtain Okun's
law is "the upgrading of workers into more productive jobs in a high-pressure
economy."

In this paper, we have provided a solid foundation for Okun's argument.
The most important point of Okun's argument and also our present analysis
is that the allocation of production factors is not independent of the level of
the aggregate demand. Rather, it depends crucially on the aggregate demand.
As we explained in section II, the fundamental principle of statistical physics
indeed tells us that it is impossible for production factors to achieve the same
(highest) level of productivity. Rather, we must always observe a distribution
of productivity in the economy. Moreover, the theory indicates that it is
an exponential (Maxwell-Boltzmann) distribution, the exponent of which
depends inversely on the level of aggregate demand. As the aggregate demand
rises, production factors are mobilized to high-productivity firms and sectors
just as Okun argued.

Figure 11: The values of $\delta$ calculated from Equation (68)
for the Japanese Listed Firms

In section III, we showed that there indeed exist distributions of productivity
across workers, firms, and sectors. A serious problem for the theory
of stochastic macro-equilibrium is, however, that the observed distribution
of productivity across workers is not exponential, but obeys the power-law.
We reconciled this empirical observation with the basic theory by introducing
the assumption that productivity dispersion across firms obeys the power
law rather than being uniform, and also another plausible assumption that the
aggregate demand is not constant, but fluctuates.

In section IV, first, we explained how the power distribution of produc- 
tivity across firms arises in a simple stochastic model. To obtain the power 
law distribution, we need to make a crucial assumption that the higher the 
current level of productivity is, the greater the probability of either a unit 
increase or decrease of productivity is. Obviously, the probability of either a 
birth or a death is greater in a city with large population than in a small city. 
By analogy, the above assumption seems quite natural. 

The "size effect" in productivity (or TFP) growth has been much discussed
in endogenous growth theory because its presence evidently contributes to
endogenous growth. See, for example, Solow (2000). In growth theory, an
increase in productivity is mostly identified with pure technological progress so
that it is directly linked to R&D investment. However, to obtain the power-law
distribution of productivity across firms, we must assume a significant
probability of a decrease in productivity. This suggests strongly that productivity
changes facing firms are caused not only by technical progress, but also by
allocative disturbances to demand. Incidentally, Davis, Haltiwanger, and
Schuh (1996) report that unlike job creation, job destruction for an industry is
not systematically related to total factor productivity (TFP) growth; namely,
job destruction occurs in high TFP growth industries as frequently as in low
TFP growth industries (their Table 3.7 on page 52). This fact also suggests
the presence of significant demand reallocation.

As Davis, Haltiwanger and Schuh (1996) rightly emphasize, allocative
demand shocks play a very important role in the macroeconomy. At the same
time, the aggregate demand also plays a crucial role because the allocation
of resources and production factors depends crucially on the level of aggregate
demand. Put simply, the frontier of the production possibility set is a
never-never land. The higher the level of aggregate demand is, the closer the
economy is to the frontier.

Appendix A. Notes on Data Used in Section III.A

As noted in section III, we used the Nikkei-NEEDS (Nikkei Economic
Electronic Databank System) database for the analysis of the empirical distribution
of productivity. This database is a commercial product available from Nikkei
Media Marketing, Inc. (2008) and contains financial data of all the listed firms
in Japan. As such, it is a well-established and representative database, widely
used for various purposes from research to practical business applications. For
our purpose, we used the 2007 CD-ROM version and extracted data for the
period between 1980 and 2006. It covers some 1,700 to 3,000 firms and 4 to
6 million workers.

We have found that in certain cases, the productivity calculated is
unrealistically large. For example, firms that became stock-holding firms report
huge reductions in the number of employees, while maintaining the same order
of revenues in the year. This results in absurd values of labor productivity $c$
for that year. Because of these abnormalities, we have excluded the top ten firms
in terms of productivity each year. This roughly corresponds to excluding
firms with productivity $c > 10^9$ yen/person. We experimented
with several different cuts, e.g., cutting the top twenty firms, and so on. The
results obtained remained basically the same as reported in the main text.
The values of the Pareto indices $\mu_W$, $\mu_F$, $\mu_S$ given in this paper are determined
by fitting the data with the GB2 distribution described in Kleiber and
Kotz (2004) by the maximum likelihood method.

Figure 12: The Relation between the Aggregate Demand, $D$, and the
Temperature, $T$ $(= 1/\beta)$. Notes: For a productivity distribution of firms
with $\mu_F = 1.5$.



Appendix B. The Relationship between Aggregate Demand $D$
and the Exponent $\beta$

In this appendix, we prove the following three basic properties (i)-(iii) of the
temperature-aggregate demand relation (54) in Macro-Equilibrium:

(i) The temperature, $T = 1/\beta$, is a monotonically increasing function of the
aggregate demand, $D$. We can prove it using equation (54) as follows:

(70)  $\dfrac{dD}{dT} = \dfrac{1}{T^2}\,\dfrac{d^2}{d\beta^2} \ln Z(\beta) = \dfrac{1}{T^2} \left( \langle c^2 \rangle_\beta - \langle c \rangle_\beta^2 \right) \geq 0,$

where $\langle c^n \rangle_\beta$ is the n-th moment of productivity defined as follows:

(71)  $\langle c^n \rangle_\beta \equiv \dfrac{1}{Z(\beta)} \int_0^{\infty} c^n\,P^{(F)}(c)\,e^{-\beta c}\,dc.$

Comparing (53) and (71), we know that $\langle c \rangle_\beta = D$. This is a natural
result. As the aggregate demand $D$ rises, workers move to firms with
higher productivity. It corresponds to the higher temperature according
to the weight factor $e^{-\beta c}$.

(ii) For $T \to 0$ ($\beta \to \infty$),

(72)  $D \to 0.$

This is evident from the fact that in the same limit the integration in
Eq. (53) is dominated by $c \simeq 0$ due to the factor $e^{-\beta c}$, and the integrand
has an extra factor of $c$ compared to the denominator $Z(\beta)$.

(iii) For $T \to \infty$ ($\beta \to 0$),

(73)  $D \to \int_0^{\infty} c\,P^{(F)}(c)\,dc \;\; (= \langle c \rangle_0).$

This can be established based on the property (i) because $D = \langle c \rangle_\beta \to
\langle c \rangle_0$ as $\beta \to 0$ and $Z(0) = 1$.

An example of the relation between $D$ and $T = 1/\beta$ is given in Figure 12.
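Property (i) can be checked numerically on a toy example. The sketch below is an illustration under stated assumptions, not the computation behind Figure 12: productivity is restricted to assumed discrete levels $c = 1, \ldots, 5000$ with Pareto-like weights $c^{-\mu_F - 1}$, so the integrals in (53) and (71) become sums. It compares the slope $dD/dT$ obtained from the variance formula (70) with a central finite difference.

```python
import math

MU_F = 1.5
LEVELS = range(1, 5001)  # assumed discrete productivity levels c = 1, ..., 5000
WEIGHTS = [c ** (-MU_F - 1.0) for c in LEVELS]  # unnormalized firm weights

def moment(n, beta):
    """n-th moment <c^n>_beta of equation (71), with the integral replaced by a sum."""
    z = sum(w * math.exp(-beta * c) for c, w in zip(LEVELS, WEIGHTS))
    num = sum(c ** n * w * math.exp(-beta * c) for c, w in zip(LEVELS, WEIGHTS))
    return num / z

def demand(temperature):
    """Aggregate demand D = <c>_beta at temperature T = 1/beta."""
    return moment(1, 1.0 / temperature)

def slope_from_variance(temperature):
    """dD/dT from equation (70): the variance of c divided by T^2."""
    beta = 1.0 / temperature
    return (moment(2, beta) - moment(1, beta) ** 2) / temperature ** 2

def slope_by_finite_difference(temperature, step=1e-4):
    """Central-difference estimate of dD/dT."""
    return (demand(temperature + step) - demand(temperature - step)) / (2 * step)

for t in (0.5, 2.0, 10.0, 100.0):
    print(t, demand(t), slope_from_variance(t), slope_by_finite_difference(t))
```

The two slope columns should agree closely and stay positive, while $D$ rises with $T$ toward the unweighted mean of the assumed levels.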



References 



Beck, C. 2006. "Superstatistical Brownian motion." Progress of Theoretical 
Physics. Supplement, 162: 29-36. 

Beck, C. and E. G. D. Cohen. 2003. "Superstatistics." Physica A, 322: 267-275.

Champernowne, D.G. 1973. The Distribution of Income between Persons, 
Cambridge: Cambridge University Press. 

Davis, S.J., J.C. Haltiwanger, and S. Schuh. 1996. Job Creation and 
Destruction, Cambridge, MA: MIT Press. 

Fay, Jon A. and James L. Medoff. 1985. "Labor and Output over the
Business Cycle: Some Direct Evidence." American Economic Review, 75:
638-655.

Gabaix, Xavier. 2005. "The Granular Origins of Aggregate Fluctuations."
MIT and NBER, DP.

Ijiri, Yuji and Herbert A. Simon. 1975. "Some Distributions Associated
with Bose-Einstein Statistics." Proceedings of the National Academy of
Sciences of the United States of America, 72(5): 1654-1657.

— and — . 1977. Skew Distributions and the Sizes of Business Firms, Ams- 
terdam: North-Holland Pub. Co.. 

Keynes, J.M. 1936. The General Theory of Employment, Interest, and 
Money, London: Macmillan. 

Kleiber, C. and Samuel Kotz. 2004. Statistical Size Distributions in Eco- 
nomics and Actuarial Sciences, Hoboken, New Jersey: John Wiley and 
Sons, Inc.. 






Marsili, M. and Y.-C. Zhang. 1998. "Interacting individuals leading to
Zipf's law." Physical Review Letters, 80(12): 2741-2744.

Mortensen, D. T. 2003. Wage Dispersion, Cambridge, MA: MIT Press. 

Nikkei NEEDS CD-ROM. 2008. Nikkei Media Marketing, Inc.
http://www.nikkeimm.co.jp/english/index.html and
http://www.nikkeimm.co.jp/service/macro/needs/con_needs-cd.html
(in Japanese, accessed May 12, 2008).

Okun, A. M. 1962. "Potential GNP: Its Measurement and Significance." in 
"Proceedings of the Business and Economic Statistics Section" American 
Statistical Association, 98-104. 

— . 1973. "Upward Mobility in a High-Pressure Economy." Brookings Papers 
on Economic Activity, 1: 207-261. 

Solow, Robert M. 2000. Growth Theory: An Exposition, second ed., New 
York: Oxford University Press. 

Sutton, John. 1997. "Gibrat's Legacy." Journal of Economic Literature, 
March: 40-59. 

Tobin, J. 1972. "Inflation and Unemployment." American Economic Review, 
62: 1-18. 

Touchette, H. and C. Beck. 2005. "Asymptotics of superstatistics." Phys- 
ical Review E, 71(1): 16131-1-6. 






